Review Class 7 Kosovo ERW submissions:
To start, load the CDC SVI dataset for New York State into QGIS. This is the same data that will be used for the social vulnerability component of this week's assignment:
The first mapping will be on RPL_THEMES, the overall percentile ranking scaled from 0 to 1:
Mapped via Natural Breaks, 5 classes. Note the -999 class, which should be visually separated from the other values.
Note:
The SPL_THEMES column in the attribute table is the sum of the series themes prior to scaling in the RPL_THEMES column.
For RPL_THEMES, the -999 value should be removed. This has been completed in the feature set exclude_999:
The scaled score is readily evident when the -999 value has been removed from the dataset.
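The effect of removing -999 can be sketched in a few lines of Python. This is only an illustration: the sample values and the helper name `exclude_no_data` are hypothetical, and it assumes -999 is used as a placeholder for missing data rather than a real score.

```python
# Assumption: -999 marks "no data" and is not a real percentile ranking.
NO_DATA = -999


def exclude_no_data(values, sentinel=NO_DATA):
    """Return only the values that are not the no-data sentinel."""
    return [v for v in values if v != sentinel]


# Hypothetical sample of RPL_THEMES values, including two placeholder rows.
rpl_themes = [0.12, -999, 0.87, 0.45, -999, 1.0, 0.0]
valid = exclude_no_data(rpl_themes)

# With the sentinel present, the minimum is dragged far below the real range.
print("with -999:   ", min(rpl_themes), max(rpl_themes))
# With the sentinel removed, the 0 to 1 scaled range becomes visible.
print("without -999:", min(valid), max(valid))
```

In QGIS itself the equivalent step would be a filter on the layer; the sketch only shows why the statistics and class breaks change once the placeholder rows are gone.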
RPL_THEMES column:
Statistics Tool
Statistics Profile for variable RPL_THEMES
EP_AGE17:
Data Dictionary Listing for column EP_AGE17
Statistics Profile for variable EP_AGE17
As we’ve seen in Part II, EP_AGE17 has the following minimum and maximum values:
MIN = 0
MAX = 65.1
The values therefore range from zero to just above 65. With linear interpolation, the goal is to ‘remap’ these original values to a (usually) more standardized, ‘simple’ scale. For our purposes, we will map the values to a scale of 1 to 10.
scale_linear transforms a given value from an input domain to an output range using linear interpolation.
Create a new column LIN_S_17, type integer, and populate it using the linear scale math function as follows:
scale_linear("EP_AGE17", 0, 65.1, 1, 10)
Field Calculator with scale_linear function
Check resulting remapped values via statistics tool
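To make the interpolation concrete, here is a minimal Python sketch of the arithmetic behind a call such as scale_linear("EP_AGE17", 0, 65.1, 1, 10). It is not the QGIS source code. The clamping of values outside the input domain is an assumption about the expression's behavior, and the function is written here purely for illustration.

```python
def scale_linear(value, domain_min, domain_max, range_min, range_max):
    """Linearly map value from [domain_min, domain_max] onto [range_min, range_max]."""
    if domain_min >= domain_max:
        raise ValueError("domain_min must be smaller than domain_max")
    # Assumption: inputs outside the domain are clamped to the ends of the range.
    if value <= domain_min:
        return range_min
    if value >= domain_max:
        return range_max
    slope = (range_max - range_min) / (domain_max - domain_min)
    return range_min + slope * (value - domain_min)


# Remap a few EP_AGE17-style percentages from 0..65.1 onto 1..10.
for pct in (0, 16.3, 32.55, 65.1):
    print(pct, "->", scale_linear(pct, 0, 65.1, 1, 10))
```

The domain minimum maps to 1, the domain maximum maps to 10, and everything in between falls on the straight line connecting those two points.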
Using the min-max normalization formula, the following equation can be applied to a new column in Field Calculator titled MM_17, type decimal:
Formula: X new = (X - X min) / (X max - X min)
As applied in the new column: MM_17 = ("EP_AGE17" - 0) / (65.1 - 0)
Field Calculator with min - max formula applied
Statistics Profile for MM_17 column
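The same calculation can be checked outside QGIS with a short Python sketch. The function name `min_max_normalize` and the sample values are illustrative only; the minimum and maximum are the EP_AGE17 values quoted above.

```python
def min_max_normalize(x, x_min, x_max):
    """Rescale x to the 0..1 range: (x - x_min) / (x_max - x_min)."""
    if x_max == x_min:
        raise ValueError("x_max and x_min must differ")
    return (x - x_min) / (x_max - x_min)


EP_AGE17_MIN = 0
EP_AGE17_MAX = 65.1

# Hypothetical EP_AGE17 values for a handful of tracts.
for pct in (0, 13.0, 32.55, 65.1):
    print(pct, "->", round(min_max_normalize(pct, EP_AGE17_MIN, EP_AGE17_MAX), 3))
```

The minimum maps to 0 and the maximum maps to 1, which is what the statistics profile for MM_17 should confirm.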
Min-max normalization can also be performed for a custom range, e.g. 1 to 10 as opposed to the 0 to 1 range we have used thus far. To rescale a range between an arbitrary set of values [a, b], the formula becomes:
X new = a + ((X - X min) (b - a)) / (X max - X min)
Note: a and b are the minimum and maximum values of the new normalized range in the above formula.
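As a rough sketch, the custom-range version can be written as X new = a + (X - X min)(b - a) / (X max - X min) and tried out in Python. The function name `min_max_rescale` is hypothetical; the inputs reuse the EP_AGE17 minimum and maximum with a = 1 and b = 10.

```python
def min_max_rescale(x, x_min, x_max, a, b):
    """Rescale x from [x_min, x_max] to the custom range [a, b]."""
    if x_max == x_min:
        raise ValueError("x_max and x_min must differ")
    return a + (x - x_min) * (b - a) / (x_max - x_min)


# Rescale EP_AGE17-style values (min 0, max 65.1) onto the range 1..10.
for pct in (0, 32.55, 65.1):
    print(pct, "->", round(min_max_rescale(pct, 0, 65.1, 1, 10), 3))
```

With a = 0 and b = 1 this reduces to the plain min-max formula, and for values inside the original range it is the same straight-line mapping that linear interpolation to 1 to 10 performs.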
With clustering, also known as group analysis, we may want to find cluster patterns between two variables, as follows:
Housing in structures with 10 or more units - E_MUNIT
Persons aged 17 and younger - E_AGE17
In QGIS there is a clustering tool that offers several algorithm options. For the demonstration we will use the K-Means method in the tool.
First, add the Attribute based clustering plugin to QGIS:
Attribute based clustering
The normalize attributes option is toggled ON; in effect this instructs the tool to rescale each attribute to a common range, so that an attribute with larger values does not dominate the resulting clusters:
Attribute based clustering Parameters
The class column can be mapped via categorical classification:
class categorical values mapped
E_MUNIT variable: as we know, Mid and Lower Manhattan feature almost exclusively large apartment buildings as housing options:
class categorical values mapped
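The normalize-then-cluster idea can be sketched in plain Python. This is a generic K-Means illustration, not the plugin's implementation. The tract values are made up, min-max scaling is assumed as the normalization method, and starting the centroids at the first k points is a simplification chosen to keep the result repeatable.

```python
import math


def normalize_columns(rows):
    """Min-max scale each column to 0..1 so both attributes carry equal weight."""
    scaled_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0  # guard against a constant column
        scaled_cols.append([(v - lo) / span for v in col])
    return [list(r) for r in zip(*scaled_cols)]


def k_means(points, k, max_iter=100):
    """Basic K-Means; centroids start at the first k points (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


# Hypothetical tracts as (E_MUNIT, E_AGE17) pairs.
tracts = [(50, 900), (1800, 200), (40, 1100), (2000, 150), (60, 1000), (1900, 250)]
labels, centroids = k_means(normalize_columns(tracts), k=2)
print(labels)
```

In this made-up sample, the tracts with many large multi-unit structures and few children end up in one class and the remaining tracts in the other, which mirrors the kind of pattern the categorical map of the class column is meant to reveal.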